Analysis of Job Postings for Entry-Level Professionals
AINA SYAKIRAH BINTI JAMIL

This analysis draws on job postings from two well-known platforms, LinkedIn and JobStreet, covering the period from November 2023 to January 2024. It concentrates on opportunities for those starting their careers in different industries across Malaysia. Knowing the current job trends and requirements is key to making sound decisions and running a successful job search.

OBJECTIVE

Support New Professionals: Provide entry-level workers with insights based on data to help them make decisions in today's job market.

Highlight In-Demand Skills: Identify the skills that are in high demand across various industries and clarify which ones professionals should pay attention to.

Understand Job Trends by Location: Look into the differences in job opportunities among various cities and states, providing important insights about different regions.

Forecast Job Postings: Predict how job postings will change over time using SARIMAX and ARIMA models.

DATA COLLECTION

Data was collected from the LinkedIn and JobStreet websites. LinkedIn data was collected manually, and JobStreet data was retrieved using an API. The dataset includes job postings from a variety of companies over three months, from November 2023 to January 2024.

Figure 1: Data collected from LinkedIn (left) and JobStreet (right).

DATA CLEANING

Information from both websites was combined to make the data-cleaning process more organized. The final dataset is structured into five columns: 'Job Title,' 'Industry,' 'Location,' 'Date Post,' and 'Job Skills.' To ensure accurate data, a preprocessing phase was carried out. This included handling missing values, standardizing text data, and rectifying any discrepancies in the dataset. This process improves the quality of the data, making it more consistent for our analysis.

Figure 2: Information of Dataset
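
The cleaning steps described above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual code: the sample rows, the date format, the policy of dropping rows with missing values, and the removal of duplicate postings are all assumptions made for the example.

```python
from datetime import datetime

COLUMNS = ["Job Title", "Industry", "Location", "Date Post", "Job Skills"]

# Hypothetical raw rows; the real LinkedIn and JobStreet exports are not shown in the source.
linkedin_rows = [
    {"Job Title": " Data Analyst ", "Industry": "IT", "Location": "kuala lumpur",
     "Date Post": "2023-11-15", "Job Skills": "SQL, Python"},
    {"Job Title": "Junior Accountant", "Industry": "Finance", "Location": "Penang",
     "Date Post": "", "Job Skills": "Excel"},
]
jobstreet_rows = [
    {"Job Title": "data analyst", "Industry": "it", "Location": "Kuala Lumpur",
     "Date Post": "2023-11-15", "Job Skills": "sql, python"},
    {"Job Title": "Marketing Executive", "Industry": "Marketing", "Location": "Selangor",
     "Date Post": "2024-01-10", "Job Skills": "Communication,  SEO"},
]


def clean_row(row):
    """Return a standardized copy of one posting, or None if any field is missing."""
    cleaned = {}
    for column in COLUMNS:
        # Standardize text: collapse repeated whitespace and lowercase.
        value = " ".join((row.get(column) or "").split()).lower()
        if not value:
            return None  # assumed policy: drop rows with missing values
        cleaned[column] = value
    # Assumed date format: ISO year-month-day.
    cleaned["Date Post"] = datetime.strptime(cleaned["Date Post"], "%Y-%m-%d").date()
    cleaned["Job Skills"] = [s.strip() for s in cleaned["Job Skills"].split(",") if s.strip()]
    return cleaned


def combine_and_clean(*sources):
    """Merge postings from several sources, skipping incomplete and duplicate rows."""
    seen = set()
    result = []
    for rows in sources:
        for row in rows:
            cleaned = clean_row(row)
            if cleaned is None:
                continue
            key = (cleaned["Job Title"], cleaned["Industry"], cleaned["Location"],
                   cleaned["Date Post"], tuple(cleaned["Job Skills"]))
            if key in seen:
                continue  # same posting seen on both sites (assumed discrepancy fix)
            seen.add(key)
            result.append(cleaned)
    return result


dataset = combine_and_clean(linkedin_rows, jobstreet_rows)
```

In this sample, the accountant posting is dropped because its date is missing, and the two data analyst rows collapse into one after standardization.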

DATA ANALYSIS

Figure 4 shows a histogram of job postings over the three months covered by the dataset. Postings increase around mid-December and then decline, followed by a distinct rise around mid-January.

Figure 4: Histogram showing the distribution of job postings between November 2023 and January 2024.
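
The counts behind a histogram like this one can be produced by grouping posting dates into bins. The sketch below assumes weekly bins starting on Monday and uses made-up dates; the source does not state the bin width used in the figure.

```python
from collections import Counter
from datetime import date, timedelta

# Hypothetical posting dates; the collected dataset itself is not reproduced here.
posting_dates = [
    date(2023, 11, 6), date(2023, 11, 8), date(2023, 12, 13),
    date(2023, 12, 14), date(2023, 12, 15), date(2024, 1, 16),
]


def week_start(day):
    """Return the Monday of the week containing `day`."""
    return day - timedelta(days=day.weekday())


# Count postings per week (assumed bin width: one week).
weekly_counts = Counter(week_start(d) for d in posting_dates)

# Print a simple text histogram, one row per week in chronological order.
for start in sorted(weekly_counts):
    print(start.isoformat(), "#" * weekly_counts[start])
```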

We used a word cloud to show the words mentioned most frequently in job skills. This visual gives a quick view of the skills most often highlighted in job postings.

Figure 5: Most common words in Job Skills.

Figure 6 presents a bar chart of the top five skills in each industry. Unlike the word cloud, this chart breaks the top skills down by industry, giving a clearer picture of the skill requirements in each sector.

Figure 6: Top five skills by industry.
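
A per-industry ranking of this kind can be computed by counting how many postings in each industry mention each skill. The following is a sketch with hypothetical postings; the function name `top_skills_by_industry` and the rule of counting a skill once per posting are assumptions, not the project's documented method.

```python
from collections import Counter, defaultdict

# Hypothetical cleaned postings with a list of skills each.
postings = [
    {"Industry": "it", "Job Skills": ["python", "sql"]},
    {"Industry": "it", "Job Skills": ["python", "communication"]},
    {"Industry": "finance", "Job Skills": ["excel", "communication"]},
]


def top_skills_by_industry(rows, n=5):
    """Return the n most frequently listed skills for each industry."""
    counts = defaultdict(Counter)
    for row in rows:
        # set() counts each skill at most once per posting.
        counts[row["Industry"]].update(set(row["Job Skills"]))
    return {industry: counter.most_common(n) for industry, counter in counts.items()}


top_skills = top_skills_by_industry(postings)
```

Each industry maps to a list of (skill, count) pairs, which can be passed directly to a bar-chart routine.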

Figure 7 shows a map of Malaysia that represents job vacancies across the country's regions. The map highlights a significant concentration of positions in key areas such as Kuala Lumpur (KL), Selangor, Penang, and Johor Bahru (JB), giving a clear overview of where job opportunities are located.

Figure 7: Map of Malaysia showing how job postings are spread across different states.

DATA MODELLING

We aim to predict how job postings will change over time using two time-series models, SARIMAX and ARIMA. Both are fitted to the historical posting counts and produce forecasts from the patterns found in that data.

Figure 8: Forecasting Job Postings Over Time with SARIMAX Model
Figure 9: Forecasting Job Postings Over Time with ARIMA Model
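
For reference, the two model families named above can be written in standard textbook notation. Here y_t is the number of job postings at time t, B is the backshift operator, and ε_t is white noise. The orders (p, d, q), the seasonal orders (P, D, Q), the seasonal period s, and any exogenous regressors x_t are placeholders, since the source does not report the values used in this project.

```latex
% ARIMA(p, d, q), with backshift operator B y_t = y_{t-1}
\phi(B)\,(1 - B)^d\, y_t = c + \theta(B)\,\varepsilon_t,
\qquad
\phi(B) = 1 - \sum_{i=1}^{p} \phi_i B^i,
\quad
\theta(B) = 1 + \sum_{j=1}^{q} \theta_j B^j

% SARIMAX(p, d, q)(P, D, Q)_s: regression on x_t with seasonal ARIMA errors u_t
y_t = \beta^{\top} x_t + u_t,
\qquad
\phi(B)\,\Phi(B^s)\,(1 - B)^d\,(1 - B^s)^D\, u_t = \theta(B)\,\Theta(B^s)\,\varepsilon_t
```

SARIMAX extends ARIMA with seasonal terms and optional exogenous inputs; with no seasonal terms and no regressors it reduces to an ARIMA-type model.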

RESULT

Figure 10: MAE and MSE for ARIMA (left) and SARIMAX (right).

The ARIMA model shows smaller MAE and MSE values than the SARIMAX model. Lower values of both metrics mean the forecasts lie closer to the observed counts, so on these measures ARIMA is the more accurate model for this dataset.
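
The two metrics can be computed directly from observed and forecast values. The sketch below uses hypothetical numbers purely to show the calculation; `forecast_a` and `forecast_b` are illustrative and are not the project's ARIMA or SARIMAX outputs.

```python
def mean_absolute_error(actual, predicted):
    """MAE: average of the absolute differences between actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mean_squared_error(actual, predicted):
    """MSE: average of the squared differences; penalizes large errors more heavily."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical observed posting counts and two competing forecasts.
actual = [10, 12, 9, 15]
forecast_a = [11, 12, 10, 13]
forecast_b = [14, 9, 12, 10]

mae_a = mean_absolute_error(actual, forecast_a)
mse_a = mean_squared_error(actual, forecast_a)
mae_b = mean_absolute_error(actual, forecast_b)
mse_b = mean_squared_error(actual, forecast_b)
```

In this toy case `forecast_a` has the lower MAE and MSE, so it would be preferred under the same criterion applied to the two models above.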

CONCLUSION

This data science project analyzed job postings collected from LinkedIn and JobStreet between November 2023 and January 2024. The dataset included job titles, industries, locations, posting dates, and required job skills. The analysis revealed key insights, including the in-demand skills, popular industries, and hotspots for job opportunities in Malaysia. The presentation aimed to give entry-level professionals information to strengthen their job search strategies. For data modelling, ARIMA was the more effective model for forecasting job postings over time.